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ABSTRACT 


In this paper we obtain optimum estimates of nonobservable random 
variables or random processes which influence the rate functions of a 
discrete time jump process (DTJP). 

The approach we follow is based on the a posteriori probability of a 
nonobservable event expressed in terms of the a priori probability of that 
event and of the sample function probability of the DTJP. Thus, we obtain 
a general representation for optimum estimates and recursive equations 
for MMSE estimates. 

In general, MMSE estimates are nonlinear functions of the observations. 
We examine the problem of estimating the rate of a DTJP when the rate is 
a random variable with a probability density function of the form $c\,x^{k}(1-x)^{m}$ 
and show that the MMSE estimates are linear in this case. This class of 
density functions is rather rich and explains why there are insignificant 
differences between optimum unconstrained and linear MMSE estimates in 
a variety of problems. 

1. INTRODUCTION 

Estimation and decision problems arising in communications and control 
have been studied in detail for continuous time observations. However, not much 
has been published for the case in which the observation process is a discrete 
time jump process (DTJP). We define a DTJP as a process having arbitrary 
jumps at times $t_1, t_2, \ldots$. A more precise definition is given below. Segall [1] 
obtained some optimum estimates for the special case where the jumps are 
restricted to be unity by using discrete time martingale techniques. In this 
paper we derive optimal estimates for more general cases. 

In Section 2 we define discrete time jump processes precisely, present 
some representations, and derive the likelihood function for an observed 
realization. In Section 3 we derive the a posteriori probability measure for a 
nonobservable random process we wish to estimate given an observed realization 
of the DTJP. Recursive optimum estimation equations are derived in Section 4. 
The problem of optimum linear estimation is briefly discussed in Section 5. An 
interesting example in which the optimum estimates turn out to be linear is 
presented in Section 6. 

2. DEFINITION, REPRESENTATIONS, AND LIKELIHOOD FUNCTION 
FOR DISCRETE TIME JUMP PROCESSES 


We wish to describe an arbitrary discrete time jump process, taking 
values on an $\ell$-dimensional Euclidean space $R^{\ell}$, by means of discrete time counting 
processes. This approach has been used in the context of processes with 
independent increments and in general continuous time jump processes. 

Let $T$ be the countable set 

$$T = \{t_0, t_1, t_2, \ldots\}$$

where $t_i$ is a real number, i.e. $t_i \in R$, for $i = 0, 1, 2, \ldots$. Let $\Omega$ be the set of all 
possible piecewise constant right continuous functions defined on $R$, taking 
values on $R^{\ell}$, and having jumps in $T$ only. An element $\omega \in \Omega$ will be called a sample 
function. Define the variables $Y_i \triangleq Y(t_i)$ and $y_i \triangleq y(t_i)$ as 

$$Y_i(\omega) = \text{value of } \omega \text{ at time } t = t_i \in T \text{ for } \omega \in \Omega,$$

$$Y_0(\omega) = 0;$$

$$y_i(\omega) = Y_i(\omega) - Y_{i-1}(\omega) = \text{jump size of } \omega \text{ at time } t = t_i \in T.$$


Let $\mathfrak{F}$ be the minimal sigma-algebra of subsets of $\Omega$ such that all functions 
$(Y_i(\omega), t_i \in T)$ are measurable. Denote by $P$ any probability measure on $\mathfrak{F}$. The 
triple $(\Omega, \mathfrak{F}, P)$ will be called the discrete time jump process and will be denoted 
by $Y$. Since $y_i$ is an $\mathfrak{F}$-measurable function for all $i \ge 0$, we define $\mathfrak{F}_k$ to be the sub 
sigma-algebra of $\mathfrak{F}$ generated by $(y_i(\omega), i \in \{0, 1, \ldots, k\})$. For any Borel set $A$ of 
$R^{\ell}$, with $0 \notin A$, define the random variables $N_k(A)$ and $n_k(A)$ as 

$$N_k(\omega, A) = \sum_{0 < i \le k} I(Y_i(\omega) - Y_{i-1}(\omega) \in A) \tag{1}$$

$$n_k(\omega, A) = N_k(\omega, A) - N_{k-1}(\omega, A) = I(y_k(\omega) \in A) \tag{2}$$

where $I(\cdot \in A)$ is the indicator set function of the set $A$. In accordance with 
accepted usage, we shall drop the symbol $\omega$ and write $Y_k$, $N_k$, $n_k$, etc., for 
$Y_k(\omega)$, $N_k(\omega)$, $n_k(\omega)$, etc., respectively. Note that $N_k(A)$ represents the number 
of jumps of the process $Y$ that fall in $A$ during the time interval $(t_0, t_k]$. Thus 
$N_k(A)$ is a finite, nondecreasing, $\mathfrak{F}_k$-measurable function of $k$. Therefore, 
$(N_k(A), k = 0, 1, \ldots)$ is a submartingale for any Borel set $A \subset R^{\ell}$. The Doob 
decomposition for submartingales [2, Chapter VII] implies that there exists a 
unique decomposition of $N_k(A)$ in terms of a $(\mathfrak{F}_k, P)$-martingale $Q_k(A)$ and a 
$\mathfrak{F}_{k-1}$-measurable, increasing process $\Pi_k(A)$ with $\Pi_0(A) = 0$ such that 

$$N_k(A) = Q_k(A) + \Pi_k(A); \quad k = 0, 1, 2, \ldots \tag{3}$$

From (1)-(3) we obtain, for $k = 0, 1, 2, \ldots$, and any Borel set $A$, 

$$n_k(A) = q_k(A) + \pi_k(A) \tag{4}$$

where 

$$q_k(A) = Q_k(A) - Q_{k-1}(A), \qquad \pi_k(A) = \Pi_k(A) - \Pi_{k-1}(A).$$

Note that $(q_k(A), k = 0, 1, \ldots)$ is a martingale difference sequence (MD). 

Remark 1. The random variable $\pi_k(A)$ has a simple interpretation in terms 
of the conditional probability of a jump at time $t_k$. By taking the conditional 
expectation with respect to $\mathfrak{F}_{k-1}$ of both sides of (4) we obtain 

$$\pi_k(A) = E(n_k(A) \mid \mathfrak{F}_{k-1}) = P(y_k \in A \mid \mathfrak{F}_{k-1}) \tag{5}$$


The Doob decomposition (4) has been defined here i) to model the 
process $Y$, ii) to guarantee the existence of $P(y_k \in A \mid \mathfrak{F}_{k-1})$, and iii) for obtaining 
estimates of nonobservable events (Section 5). 
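
The definitions of the jump sizes, the indicator process, and the counting process in (1) and (2) can be illustrated with a short sketch. It assumes a scalar process ($\ell = 1$) and takes the Borel set $A$ to be the positive jumps; the function names and the sample path are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def jump_sizes(Y: Sequence[float]) -> List[float]:
    """y_i = Y_i - Y_{i-1} for i >= 1, with y_0 = Y_0 = 0."""
    return [0.0] + [Y[i] - Y[i - 1] for i in range(1, len(Y))]


def indicator_process(y: Sequence[float], in_A: Callable[[float], bool]) -> List[int]:
    """n_k(A) = I(y_k in A) for k >= 1; n_0(A) = 0 by convention."""
    return [0] + [1 if in_A(v) else 0 for v in y[1:]]


def counting_process(n: Sequence[int]) -> List[int]:
    """N_k(A) = sum of n_i(A) over 0 < i <= k, as in equation (1)."""
    counts, total = [], 0
    for value in n:
        total += value
        counts.append(total)
    return counts


# Hypothetical sample path Y_0, ..., Y_5 (piecewise constant, Y_0 = 0).
Y = [0.0, 0.0, 2.0, 2.0, 1.0, 4.0]
y = jump_sizes(Y)

# Assumed Borel set A = (0, infinity); it excludes 0 as required.
n = indicator_process(y, lambda v: v > 0)
N = counting_process(n)
```

Here `N` is nondecreasing in `k` and `n[k] == N[k] - N[k - 1]`, which is the relation stated in (2).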

It is possible to represent the process $(y_k, t_k \in T)$ by means of the process 
$n_k$ defined in (2). The following lemma is a special case of a result given in 
Gikhman and Skorokhod [3, Chapter VI] and the proof will be omitted. 

Lemma. Let $y_k(A) \triangleq y_k I(y_k \in A) = y_k n_k(A)$ for any Borel set $A \subset R^{\ell}$ with $0 \notin A$. 
Then 

$$y_k(A) = \int_A x\, n_k(dx) = \int_A x\, q_k(dx) + \int_A x\, \pi_k(dx) \tag{6}$$

for $k = 1, 2, \ldots$. 

Note that $y_k(A)$ is the jump size of $Y_k$ provided that $Y_k - Y_{k-1} \in A$. If 
$A = R^{\ell} - 0$, (6) becomes $y_k(R^{\ell} - 0) = y_k$, with 

$$y_k = \int x\, n_k(dx) = \int x\, q_k(dx) + \int x\, \pi_k(dx) \tag{7}$$

where the integration is on the space $R^{\ell}$ with the vector $0$ excluded. The integrals 
in (6) and (7) are defined in the sense of Gikhman and Skorokhod [3, Section 3, 
Chapter VII]. 

Remark 2. If the space of all possible jumps of $Y$ is countable, say $\mathcal{U} \subset R^{\ell}$, with 
$\mathcal{U} = \{U_1, U_2, \ldots\}$, the above representation reduces to 

$$n_k(U) = q_k(U) + \lambda_k(U) \tag{8}$$

where 

$$n_k(U) = I(y_k = U) \quad \text{for } U \in \mathcal{U} \tag{9}$$

$$\lambda_k(U) = \pi_k(U) \tag{10}$$

and 

$$y_k = \sum_{i=1}^{\infty} U_i\, n_k(U_i) = \sum_{i=1}^{\infty} U_i\, q_k(U_i) + \sum_{i=1}^{\infty} U_i\, \lambda_k(U_i) \tag{11}$$

In the estimation problem we will study later on, we will assume, for 
simplicity, a countable jump space $\mathcal{U}$. 

The likelihood function. The likelihood function is a quantity proportional to 
the probability of observing a particular realization of the jump process 
$(Y_i, t_0 \le t_i \le t_k)$ for $t_i, t_k \in T$ and plays a fundamental role in estimation and 
decision problems [4]. 

We wish to find the likelihood function for a discrete time discrete 
amplitude jump process. Denote by $p_k$ the probability of having a 
particular realization of $Y$, i.e. 

$$p_k = P(y_i = \xi_i,\ i = 0, 1, \ldots, k) \tag{12}$$

where $\xi_i \in \mathcal{U}$. Then 

$$p_k = \prod_{i=1}^{k} P(y_i = \xi_i \mid y_j = \xi_j,\ j = 0, 1, \ldots, i-1)\; P(y_0 = \xi_0) \tag{13}$$

where 

$$P(y_k = \xi_k \mid y_i = \xi_i,\ i = 0, 1, \ldots, k-1) = P(y_k = \xi_k \mid \mathfrak{F}_{k-1}) = P(n_k(\xi_k) = 1 \mid \mathfrak{F}_{k-1}) = \lambda_k(\xi_k) \tag{14}$$


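Let $t_{j_1}, t_{j_2}, \ldots$ be the jump times of the random process $Y$ with jump 
amplitudes $\xi_{j_1}, \xi_{j_2}, \ldots$. Then $n_{j_i}(\xi_{j_i}) = 1$, $i = 1, 2, \ldots$. The probability of 
no jump, $n_k = 0$, is given by 

$$P(n_k = 0 \mid y_i = \xi_i,\ i = 0, 1, \ldots, k-1) = 1 - \lambda_k$$

where 

$$\lambda_k = \sum_{i=1}^{\infty} \lambda_k(U_i).$$

Therefore, the likelihood function (13) becomes, with $p_0 \triangleq P(y_0 = \xi_0)$, 

$$p_k = p_0 \prod_{i=1}^{k} [\lambda_i(\xi_i)]^{n_i} (1 - \lambda_i)^{1 - n_i} \tag{15}$$

which can also be written as 

$$p_k = \exp\left\{ \sum_{i=1}^{k} \left[ \sum_{m=1}^{\infty} (\ln \lambda_i(U_m))\, n_i(U_m) + (\ln(1 - \lambda_i))(1 - n_i) \right] \right\} \tag{16}$$

where we have used the fact that $y_0 = 0$ (so that $p_0 = 1$), $n_i = \sum_{m=1}^{\infty} n_i(U_m)$, and 

$$\sum_{i=1}^{k} \sum_{m=1}^{\infty} (\ln \lambda_i(U_m))\, n_i(U_m) = \sum_{i=1}^{k} (\ln \lambda_i(\xi_i))\, n_i(\xi_i).$$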
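
As an illustration, the product form (15) and the exponential form (16) can be compared numerically. The sketch below assumes a finite jump space, rates that are known numbers at each step, and $p_0 = 1$; the data layout (a dictionary of rates per step, `None` for no jump) and the function names are hypothetical.

```python
import math
from typing import Dict, List, Optional

Rates = Dict[int, float]  # amplitude U -> lambda_i(U) for one time step


def likelihood_product(rates: List[Rates], jumps: List[Optional[int]]) -> float:
    """Equation (15) with p_0 = 1: multiply the per-step probabilities."""
    p = 1.0
    for lam, xi in zip(rates, jumps):
        total_rate = sum(lam.values())           # lambda_i = sum over U of lambda_i(U)
        p *= lam[xi] if xi is not None else 1.0 - total_rate
    return p


def likelihood_exponential(rates: List[Rates], jumps: List[Optional[int]]) -> float:
    """Equation (16): exponential of the summed log terms."""
    log_p = 0.0
    for lam, xi in zip(rates, jumps):
        total_rate = sum(lam.values())
        n_i = 0 if xi is None else 1             # n_i = sum over U of n_i(U)
        log_p += sum(math.log(l) for U, l in lam.items() if U == xi)
        log_p += math.log(1.0 - total_rate) * (1 - n_i)
    return math.exp(log_p)


# Assumed example: two amplitudes with constant rates, four time steps.
step_rates = [{1: 0.2, 2: 0.1}] * 4
observed = [1, None, 2, None]                    # None marks "no jump"

p_product = likelihood_product(step_rates, observed)
p_exponential = likelihood_exponential(step_rates, observed)
```

Both routines evaluate the same quantity, here $0.2 \times 0.7 \times 0.1 \times 0.7$; the exponential form is the one that carries over to the likelihood ratio used in Section 3.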


Remark 3. If the set of all jump amplitudes of $Y$ is uncountable, 
then, by assuming that the limit 

$$\lambda_k(x) = \lim_{\max_i |\Delta x_i| \to 0} \Big( \prod_{i=1}^{\ell} \Delta x_i \Big)^{-1} \pi_k([x, x + \Delta x))$$

exists, where $x = (x_1, x_2, \ldots, x_{\ell}) \in R^{\ell}$, then the likelihood function is: 

3. A POSTERIORI PROBABILITIES FOR ESTIMATION 
We shall formulate the estimation problem in a manner motivated by a 
problem in communication theory [5], [6]. Let $X(t)$ be a nonobservable 
"signal" which is applied to the input of a general channel. Let $y(t)$ be the 
output of the channel at time $t \in T$. We will assume that the observation record 
of the channel output at times $t_0, t_1, t_2, \ldots$ is a sample function of the discrete 
time jump process described in Section 2. Based on the observation record 
$(y(\sigma),\ 0 \le \sigma \le t;\ \sigma, t \in T)$ we wish to find an estimate of $X$, in particular, the 
minimum mean square error estimate. We formulate the problem in some 
convenient probability spaces in such a manner that the observations $y$ can 
influence the "signals" $X$. 

Let $(\Omega_s, \mathfrak{B}_s, P_s)$ be a probability space called the "signal space" where 
the events $B_s \in \mathfrak{B}_s$ are nonobservable. Let $(\Omega_m, \mathfrak{B}_m, P_m(\omega_s, \cdot))$, $\omega_s \in \Omega_s$, be a 
probability space called the "transfer space" where the probability measure is 
parameterized by the elements $\omega_s$. The transfer space models the channel 
behavior for each $\omega_s \in \Omega_s$. We want to obtain statistical inferences about the 
nonobservable events $B_s \in \mathfrak{B}_s$ by observing events $B_m \in \mathfrak{B}_m$. 

We will assume that the elements $\omega_m \in \Omega_m$ are the sample functions of 
the discrete time jump process described in Section 2. 

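It is convenient to construct the product space $(\Omega, \mathfrak{B}, P)$ where $\Omega = \Omega_s \times \Omega_m$, 
$\mathfrak{B} = \mathfrak{B}_s \otimes \mathfrak{B}_m$ and 

$$P(B_s \times B_m) = \int_{B_s} P_m(\omega_s, B_m)\, P_s(d\omega_s) \tag{18}$$

For example, let $E_m$ be the event $\{y_1 = U_{j_1}, y_2 = U_{j_2}, \ldots, y_k = U_{j_k}\} \in \mathfrak{B}_m$. This event 
represents a particular realization of the discrete time jump process from $t = t_0$ 
($p_0 = 1$) up to $t = t_k$. Then, from (16) 

$$P_m(\omega_s, E_m) = P(\omega_s;\ Y_i,\ i \in \{0, \ldots, k\}) = p_k(\omega_s, \omega_m^{k})$$

$$= \exp\left\{ \sum_{i=1}^{k} \left[ \sum_{n=1}^{\infty} \ln \lambda_i(U_n; \omega_s, \omega_m^{i-1})\, n_i(U_n) + \ln\big(1 - \lambda_i(\omega_s, \omega_m^{i-1})\big)(1 - n_i) \right] \right\} \tag{19}$$

where we have indicated, explicitly, that the rates $\lambda_i$ and $\lambda_i(U_n)$ depend on the 
"signal" element $\omega_s$ and the sample path from $t_0$ up to $t_{i-1}$, i.e. $\omega_m^{i-1}$. 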

Let us define a new probability measure $P_m^0$ on the transfer space, 
functionally independent of $\omega_s$ and mutually absolutely continuous with respect 
to $P_m(\omega_s, \cdot)$. For the event $E_m$, we define 

$$P_m^0(E_m) \triangleq \exp\left\{ \sum_{i=1}^{k} \left[ \sum_{n=1}^{\infty} \ln \gamma_i(U_n; \omega_m^{i-1})\, n_i(U_n) + \ln\big(1 - \gamma_i(\omega_m^{i-1})\big)(1 - n_i) \right] \right\} \tag{20}$$

where the rates $\gamma_i(U_n)$ and $\gamma_i$ are not functions of $\omega_s$. We define the likelihood 
ratio $L_k(\omega_s, \omega_m)$ as 

$$L_k(\omega_s, \omega_m) \triangleq \frac{P_m(\omega_s, E_m)}{P_m^0(E_m)} = \exp\left\{ \sum_{i=1}^{k} \left[ \sum_{n=1}^{\infty} \ln \frac{\lambda_i(U_n; \omega_s, \omega_m^{i-1})\big(1 - \gamma_i(\omega_m^{i-1})\big)}{\gamma_i(U_n; \omega_m^{i-1})\big(1 - \lambda_i(\omega_s, \omega_m^{i-1})\big)}\, n_i(U_n) - \ln \frac{1 - \gamma_i(\omega_m^{i-1})}{1 - \lambda_i(\omega_s, \omega_m^{i-1})} \right] \right\} \tag{21}$$

The probability of any event given a sample path realization of the 
observation process can be calculated in terms of the likelihood ratio given 
in (21). In fact, we have 

Theorem 1. (Prior-to-posterior probability) 

Let $(\Omega_s, \mathfrak{B}_s, P_s)$ and $(\Omega_m, \mathfrak{B}_m, P_m(\omega_s, \cdot))$ be the signal and transfer spaces 
defined previously. Let $L_k$ be a likelihood ratio between $P_m(\omega_s, \cdot)$ and $P_m^0$. 
Then 

$$P(B_s \times \Omega_m \mid \mathfrak{S}_0 \otimes \mathfrak{F}_m(t_k)) = \frac{\int_{B_s} L_k(\omega_s, \omega_m)\, P_s(d\omega_s)}{\int_{\Omega_s} L_k(\omega_s, \omega_m)\, P_s(d\omega_s)} \tag{22}$$

for every $B_s \in \mathfrak{B}_s$, where $\mathfrak{S}_0 = \{\emptyset, \Omega_s\}$. 

Proof. The proof of this theorem, in a more general context, is given in 
[6, Section IV]. 

Note that the right-hand side of (22) does not depend on $P_m^0$ because 
it can be written, alternatively, using (21), as 

$$P(B_s \times \Omega_m \mid \mathfrak{S}_0 \otimes \mathfrak{F}_m(t_k)) = \frac{\int_{B_s} P_m(\omega_s, B_m)\, P_s(d\omega_s)}{\int_{\Omega_s} P_m(\omega_s, B_m)\, P_s(d\omega_s)} \tag{23}$$

where $B_m = E_m$ and $\mathfrak{F}_k = \mathfrak{S}_0 \otimes \mathfrak{F}_m(t_k)$; therefore $P_m^0$ is a "fictitious" probability 
measure used to prove the above theorem. 
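
A small numerical sketch of (22) and (23) may help. It assumes a signal space with three points, each fixing a constant jump rate of a unit-jump process, and an arbitrary reference rate for the fictitious measure; all numbers and names are hypothetical.

```python
from typing import List, Sequence


def path_probability(rate: float, path: Sequence[int]) -> float:
    """Probability of a 0/1 path when each step jumps with probability `rate`."""
    p = 1.0
    for n in path:
        p *= rate if n == 1 else 1.0 - rate
    return p


def posterior(prior: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Prior-to-posterior map: integrate the weight over B_s, normalize over Omega_s."""
    unnormalized = [p * w for p, w in zip(prior, weights)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Assumed finite signal space: each signal point fixes the rate of the process.
candidate_rates = [0.2, 0.5, 0.8]
prior = [1.0 / 3.0] * 3
path = [1, 1, 0, 1]                              # observed n_1, ..., n_4

likelihoods = [path_probability(r, path) for r in candidate_rates]

# Fictitious reference measure with an arbitrary rate that ignores the signal.
reference = path_probability(0.3, path)
ratios = [value / reference for value in likelihoods]

post_from_likelihood = posterior(prior, likelihoods)   # form (23)
post_from_ratio = posterior(prior, ratios)             # form (22)
```

The two posteriors coincide because the reference probability cancels between numerator and denominator, which is the sense in which $P_m^0$ is fictitious.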

Remark 4 (Conditional probability density). Let the nonobservable random 
variable $X$ be defined as $X(\omega_s) = \omega_s \in \Omega_s$ where $\Omega_s = R^{n}$. We want to obtain the 
conditional probability density function of $X$ given $\mathfrak{F}_k$. For that purpose, let 
$B_s = [X, X + \Delta X)$. Then (23) becomes 

4. RECURSIVE OPTIMUM ESTIMATES 

Let $(X_k, k = 0, 1, 2, \ldots)$ be an integrable random process defined on the 
product space $(\Omega, \mathfrak{B})$. We want to obtain the best estimate $\hat{X}_k$ by observing a 
sample path from $t_0$ up to $t_n$. The criterion is the minimum mean square 
error. 

It is well known that the conditional mean minimizes the mean square 
error. Therefore, the best estimate is 

$$\hat{X}_k(\omega_m^{n}) = \int_{\Omega_s} X_k(\omega_s, \omega_m^{n})\, P(d\omega_s \mid \mathfrak{S}_0 \otimes \mathfrak{F}_m(t_n)) \tag{26}$$

Using (23), we can write (26) as 

$$\hat{X}_k(\omega_m^{n}) = \frac{E_s\big(p_n(\omega_s, \omega_m^{n})\, X_k(\omega_s, \omega_m^{n})\big)}{E_s\big(p_n(\omega_s, \omega_m^{n})\big)} \tag{27}$$

where, for simplicity, we have defined 

$$p_n(\omega_s, \omega_m^{n}) = P_m(\omega_s, B_m); \qquad B_m = \{E_m,\ k = n\} \tag{28}$$


Equation (27) is the best estimate of $X_k$ based on the observation path 
$\omega_m^{n}$ from $t_0$ up to $t_n$, and we have the following cases: 

i) smoothing estimate, if $n > k$ (i.e. $t_n > t_k$) 

ii) filtering estimate, if $n = k$ (i.e. $t_n = t_k$) 

iii) prediction estimate, if $n < k$ (i.e. $t_n < t_k$) 

For simplicity, we shall write (27) as 

$$\hat{X}_{k|n} = \frac{E_s(p_n X_k)}{E_s(p_n)} \tag{27'}$$

Note that $X_k$ may not be $\mathfrak{B}_s \otimes \mathfrak{F}_m(t_n)$-measurable for $k > n$, which implies that 
$X_k$ may not be $\mathfrak{B}_s$-measurable and (27) does not apply. However, if $X_k$ is 
constant on $\Omega_m$, then it is $\mathfrak{B}_s$-measurable and (27) applies. 

Remark 5. Note that the random process $X_k$ depends on both the signal $\omega_s$ and 
the observations $\omega_m^{k}$, which implies that feedback is allowed, and (27) is the best 
estimate of $X_k$. 

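Recursive filtering estimate. We wish to find a recursive formula for $\hat{X}_{k|k}$ given 
in (27) for $n = k$. 

From (19) we see that 

$$p_k = p_{k-1} \exp\left\{ \sum_{n=1}^{\infty} \ln\frac{\lambda_k(U_n)}{1 - \lambda_k}\, n_k(U_n) + \ln(1 - \lambda_k) \right\} \tag{29}$$

where we have dropped $\omega_s$ and $\omega_m$ for simplicity. We prove now that the 
denominator of (27) satisfies 

$$E_s(p_k) = \exp\left\{ \sum_{i=1}^{k} \left[ \sum_{n=1}^{\infty} \ln\frac{\hat{\lambda}_{i|i-1}(U_n)}{1 - \hat{\lambda}_{i|i-1}}\, n_i(U_n) + \ln(1 - \hat{\lambda}_{i|i-1}) \right] \right\} \tag{30}$$

where 

$$\hat{\lambda}_{i|i-1}(U_n) \triangleq \frac{E_s(\lambda_i(U_n)\, p_{i-1})}{E_s(p_{i-1})}, \qquad \hat{\lambda}_{i|i-1} = \sum_{n=1}^{\infty} \hat{\lambda}_{i|i-1}(U_n), \qquad k = 1, 2, \ldots.$$

For $k = 0$ we have $p_0 = 1$ and 

$$p_1 = \begin{cases} p_0 (1 - \lambda_1) & \text{if } n_1 = 0 \\ p_0\, \lambda_1(\xi) & \text{if } n_1(\xi) = 1 \end{cases} \tag{30'}$$

The last equality follows because $n_1 = 1$ if and only if there is a single jump of 
size $\xi \in \mathcal{U}$ at $t = t_1$. Notice that (30') can be written as 

$$p_1 = p_0 \big( \lambda_1(\xi)\, n_1(\xi) + (1 - \lambda_1)(1 - n_1) \big) \tag{31}$$

Taking the expectation $E_s$, dividing both sides of (31) by $E_s(p_0)$ 
(which is equal to 1), and using (27), we have 

$$\frac{E_s(p_1)}{E_s(p_0)} = \frac{E_s(p_0 \lambda_1(\xi))}{E_s(p_0)}\, n_1(\xi) + \left( 1 - \frac{E_s(p_0 \lambda_1)}{E_s(p_0)} \right)(1 - n_1).$$

Then 

$$E_s(p_1) = E_s(p_0) \big[ \hat{\lambda}_{1|0}(\xi)\, n_1(\xi) + (1 - \hat{\lambda}_{1|0})(1 - n_1) \big] \tag{32}$$

Since (31) and (32) satisfy the same type of difference equation we conclude that 
$E_s(p_1)$ satisfies (30) with $k = 1$. Using mathematical induction, it is easy to verify 
(30). The filtering estimate becomes 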

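$$\hat{X}_{k|k} = \frac{E_s(p_k X_k)}{E_s(p_k)} = E_s(\Lambda_k X_k) \tag{33}$$

where 

$$\Lambda_k \triangleq \frac{p_k}{E_s(p_k)} = \exp\left\{ \sum_{i=1}^{k} \left[ \sum_{n=1}^{\infty} \ln\frac{\lambda_i(U_n)(1 - \hat{\lambda}_{i|i-1})}{\hat{\lambda}_{i|i-1}(U_n)(1 - \lambda_i)}\, n_i(U_n) + \ln\frac{1 - \lambda_i}{1 - \hat{\lambda}_{i|i-1}} \right] \right\}$$

$$= \Lambda_{k-1} \exp\left\{ \sum_{n=1}^{\infty} \ln\frac{\lambda_k(U_n)(1 - \hat{\lambda}_{k|k-1})}{\hat{\lambda}_{k|k-1}(U_n)(1 - \lambda_k)}\, n_k(U_n) + \ln\frac{1 - \lambda_k}{1 - \hat{\lambda}_{k|k-1}} \right\} \tag{34}$$

where $\Lambda_k$ is given by an exponential formula, and $\Lambda_0 = 1$. 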

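Theorem 2 (Optimal filtering estimate). The optimum filtering estimate $\hat{X}_{k|k}$ 
given in (33) satisfies the stochastic difference equation 

$$\hat{X}_{k|k} = \hat{X}_{k|k-1} + \sum_{i=1}^{\infty} \frac{E_{k-1}(X_k \lambda_k(U_i)) - \hat{X}_{k|k-1}\, \hat{\lambda}_{k|k-1}(U_i) + h_{k|k-1}(U_i)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\; q_k(U_i) \tag{35}$$

for $k = 1, 2, \ldots$, where 

$$E_{k-1}(X_k \lambda_k(U_i)) = \frac{E_s(X_k \lambda_k(U_i)\, p_{k-1})}{E_s(p_{k-1})} \tag{36}$$

$$h_{k|k-1}(U_i) = \sum_{j=1}^{\infty} E_{k-1}\big[ X_k \big( \hat{\lambda}_{k|k-1}(U_i)\, \lambda_k(U_j) - \hat{\lambda}_{k|k-1}(U_j)\, \lambda_k(U_i) \big) \big] \tag{37}$$

and, here, 

$$q_k(U_i) = n_k(U_i) - \hat{\lambda}_{k|k-1}(U_i) \tag{38}$$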

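Proof. From (33) and (34) we have 

$$\hat{X}_{k|k} = E_s(\Lambda_k X_k) = E_s\Big( X_k \Lambda_{k-1} \frac{\Lambda_k}{\Lambda_{k-1}} \Big).$$

But 

$$\frac{\Lambda_k}{\Lambda_{k-1}} = \begin{cases} \dfrac{1 - \lambda_k}{1 - \hat{\lambda}_{k|k-1}}, & n_k = 0 \\[2ex] \dfrac{\lambda_k(U_i)}{\hat{\lambda}_{k|k-1}(U_i)}, & n_k(U_i) = 1,\ U_i \in \mathcal{U},\ i = 1, 2, \ldots \end{cases}$$

$$= \frac{1 - \lambda_k}{1 - \hat{\lambda}_{k|k-1}}\,(1 - n_k) + \sum_{i=1}^{\infty} \frac{\lambda_k(U_i)}{\hat{\lambda}_{k|k-1}(U_i)}\, n_k(U_i). \tag{39}$$

Thus, using (39) we have: 

$$\hat{X}_{k|k} = \frac{\hat{X}_{k|k-1} - E_{k-1}(X_k \lambda_k)}{1 - \hat{\lambda}_{k|k-1}}\,(1 - n_k) + \sum_{i=1}^{\infty} \frac{E_{k-1}(X_k \lambda_k(U_i))}{\hat{\lambda}_{k|k-1}(U_i)}\, n_k(U_i). \tag{41}$$

By noting that $\lambda_k = \sum_{i=1}^{\infty} \lambda_k(U_i)$, $\hat{\lambda}_{k|k-1} = \sum_{i=1}^{\infty} \hat{\lambda}_{k|k-1}(U_i)$ and $n_k = \sum_{i=1}^{\infty} n_k(U_i)$, 
the right hand side of (41) can be rearranged as $\hat{X}_{k|k-1}$ plus the two sums 

$$\sum_{i=1}^{\infty} \frac{E_{k-1}(X_k \lambda_k(U_i)) - \hat{X}_{k|k-1}\, \hat{\lambda}_{k|k-1}(U_i)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\, \big( n_k(U_i) - \hat{\lambda}_{k|k-1}(U_i) \big) \tag{42}$$

and, after some manipulation, 

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} \frac{E_{k-1}\big[ X_k \big( \hat{\lambda}_{k|k-1}(U_i)\, \lambda_k(U_j) - \hat{\lambda}_{k|k-1}(U_j)\, \lambda_k(U_i) \big) \big]}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\, q_k(U_i) = \sum_{i=1}^{\infty} \frac{h_{k|k-1}(U_i)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\, q_k(U_i) \tag{43}$$

where $h_{k|k-1}(U_i)$ is defined in (37). Combining (42) and (43) and using (38) we 
obtain (35). 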

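Example 1 - If the observation process is a discrete counting process, i.e., 
$\mathcal{U} = \{U_1\} = \{1\}$, then $\lambda_k(U_1) = \lambda_k$ and (35) reduces to 

$$\hat{X}_{k|k} = \hat{X}_{k|k-1} + \frac{E_{k-1}(X_k \lambda_k) - \hat{X}_{k|k-1}\, \hat{\lambda}_{k|k-1}}{\hat{\lambda}_{k|k-1}\,(1 - \hat{\lambda}_{k|k-1})}\, \big( n_k - \hat{\lambda}_{k|k-1} \big).$$

This equation has been obtained by Segall [1]. 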
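
The counting-process update of Example 1 can be checked against a direct application of Bayes' rule. The sketch below assumes a three-point signal space, a rate that depends only on the signal point, and a quantity $X_k = f(\omega_s)$ that does not change with $k$ (so that $\hat{X}_{k|k-1} = \hat{X}_{k-1|k-1}$); the numbers and function names are hypothetical.

```python
from typing import List, Sequence


def bayes_step(post: Sequence[float], rates: Sequence[float], n: int) -> List[float]:
    """Direct posterior update over the signal points after observing n_k = n."""
    weights = [p * (r if n == 1 else 1.0 - r) for p, r in zip(post, rates)]
    total = sum(weights)
    return [w / total for w in weights]


def innovation_step(post: Sequence[float], rates: Sequence[float],
                    values: Sequence[float], n: int) -> float:
    """Filtering update of Example 1, with E_{k-1} taken under `post`."""
    x_pred = sum(p * v for p, v in zip(post, values))            # X_hat_{k|k-1}
    lam_pred = sum(p * r for p, r in zip(post, rates))           # lambda_hat_{k|k-1}
    cross = sum(p * v * r for p, v, r in zip(post, values, rates))  # E_{k-1}(X lambda)
    gain = (cross - x_pred * lam_pred) / (lam_pred * (1.0 - lam_pred))
    return x_pred + gain * (n - lam_pred)


# Assumed finite signal space: rate and value of X attached to each point.
rates = [0.1, 0.4, 0.7]
values = [2.0, -1.0, 5.0]
post = [0.5, 0.3, 0.2]                           # prior over the signal points
observations = [1, 0, 0, 1, 1]

max_gap = 0.0
for n in observations:
    recursive = innovation_step(post, rates, values, n)
    post = bayes_step(post, rates, n)
    direct = sum(p * v for p, v in zip(post, values))
    max_gap = max(max_gap, abs(recursive - direct))
```

In this finite setting the innovation form and the direct conditional mean agree at every step up to rounding, as `max_gap` shows.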

Example 2 - Let us assume that the nonobservable random process can be 
represented as 

$$X_k = f(k, X_{k-1}, u_{k-1}) + w_k$$

where $f$ is a known function of $X_{k-1}$ and of a measurable 
control $u_{k-1}$; $w_k$ is a MD on the signal space, and is not a function of $\omega_m$; 
then $w_k$ can be interpreted as noise in the dynamics of $X_k$. Then, the one step 
prediction is 

$$\hat{X}_{k|k-1} = \hat{f}_{k|k-1} + E_s(w_k \Lambda_{k-1}).$$

We will assume that $w_k$ is a MD with respect to some sigma-algebra $\mathfrak{B}_s(t_k) \subset \mathfrak{B}_s$, 
and that $\Lambda_{k-1}$ is $\mathfrak{B}_s(t_{k-1})$ measurable, then 

$$E_s(w_k \Lambda_{k-1}) = 0.$$

Thus $\hat{X}_{k|k-1} = \hat{f}_{k|k-1}$. 

Example 3 - Let us assume that the rate parameter $\lambda_k$ is a fixed random variable 
$X$ defined on the signal space, i.e. $\lambda_k = X = X(\omega_s)$, and that $X$ is uniformly 
distributed on $[0, 1]$. The best estimate $\hat{X}_k$ at the time $t = t_k$ is given by 

$$\hat{X}_k = \hat{X}_{k-1} + \frac{E_{k-1}(X^2) - (\hat{X}_{k-1})^2}{\hat{X}_{k-1}\,(1 - \hat{X}_{k-1})}\, (n_k - \hat{X}_{k-1}), \tag{44}$$

where $\hat{X}_0 = \tfrac{1}{2}$. 

Note that in order to solve (44), we need to know $E_{k-1}(X^2)$, which can be obtained 
from another difference equation involving a term $E_{k-1}(X^3)$ and so on up to 
infinity. Therefore (44) is not a closed form solution for the best estimate. 
However, this is a general characteristic of nonlinear estimation. 

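Motivated by the problems of solving (44), we develop below a recursive 
formula for the conditional probability density $p_k(x)$. 

Theorem 3 - Let $p_k(x)$ be the conditional probability density of the random 
variable $X = X(\omega_s)$ given $\mathfrak{F}_k$. Let $\lambda_k(U_i, X)$, the rate of the jump process, for 
$U_i \in \mathcal{U}$, be a known function of $X$. Then 

$$p_k(x) = p_{k-1}(x) \left[ 1 + \sum_{i=1}^{\infty} \frac{\lambda_k(U_i, x) - \hat{\lambda}_{k|k-1}(U_i) + g_k(U_i, x)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\, q_k(U_i) \right] \tag{45}$$

where $g_k(U_i, x) = \hat{\lambda}_{k|k-1}(U_i)\, \lambda_k(x) - \hat{\lambda}_{k|k-1}\, \lambda_k(U_i, x)$, with $\lambda_k(x) = \sum_{i=1}^{\infty} \lambda_k(U_i, x)$. 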

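Proof. Let us consider the random variable $y = \exp(jvX)$; then, from (35) with $X_k$ replaced by $y$, 

$$E_k(y) = E_{k-1}(y) + \sum_{i=1}^{\infty} \frac{E_{k-1}(y\, \lambda_k(U_i)) - E_{k-1}(y)\, \hat{\lambda}_{k|k-1}(U_i) + h_{k|k-1}(U_i)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\, q_k(U_i) \tag{46}$$

Since $p_k(x)$ is the conditional probability density, then 

$$E_k(y) = \int \exp(jvx)\, p_k(x)\, dx, \quad \text{for } v \in R \tag{47}$$

Therefore, from (46) and (47) we get 

$$\int \exp(jvx)\, p_k(x)\, dx = \int \exp(jvx)\, p_{k-1}(x)\, dx + \sum_{i=1}^{\infty} \frac{q_k(U_i)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})} \Big[ \int \exp(jvx)\, \lambda_k(U_i, x)\, p_{k-1}(x)\, dx - \hat{\lambda}_{k|k-1}(U_i) \int \exp(jvx)\, p_{k-1}(x)\, dx + \int \exp(jvx) \big( \hat{\lambda}_{k|k-1}(U_i)\, \lambda_k(x) - \hat{\lambda}_{k|k-1}\, \lambda_k(U_i, x) \big) p_{k-1}(x)\, dx \Big]$$

for any $v \in R$. Thus 

$$p_k(x) = p_{k-1}(x) + p_{k-1}(x) \sum_{i=1}^{\infty} \frac{\lambda_k(U_i, x) - \hat{\lambda}_{k|k-1}(U_i) + \hat{\lambda}_{k|k-1}(U_i)\, \lambda_k(x) - \hat{\lambda}_{k|k-1}\, \lambda_k(U_i, x)}{\hat{\lambda}_{k|k-1}(U_i)\,(1 - \hat{\lambda}_{k|k-1})}\, q_k(U_i)$$

where $p_0(x)$ is the initial probability density function for the random variable $X$. 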


Example 3 (cont.) Let $\lambda_k(U_1, X) = \lambda_k = X$ be a uniformly distributed random 
variable, then (45) reduces to 

$$p_k(x) = p_{k-1}(x) \left[ 1 + \frac{x - \hat{X}_{k-1}}{\hat{X}_{k-1}\,(1 - \hat{X}_{k-1})}\, (n_k - \hat{X}_{k-1}) \right], \quad k = 1, 2, \ldots \tag{48}$$

where 

$$\hat{X}_{k-1} = \int x\, p_{k-1}(x)\, dx, \qquad p_0(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Notice that knowing $p_{k-1}(x)$ we can find $\hat{X}_{k-1}$, which in turn allows us to find 
$p_k(x)$, and so on. Thus we obtain a closed form solution for all the conditional 
moments of $X$. It is straightforward to verify from (48) that $\hat{X}_k$ satisfies (44). 
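
The density recursion (48) is easy to run on a grid. The sketch below discretizes $[0, 1]$ with a midpoint rule (an assumption made only for this illustration, as are the grid size and the observation sequence) and compares the resulting mean with the closed form $(S + 1)/(n + 2)$ that (65) in Section 6 gives for the uniform prior ($k = m = 0$).

```python
from typing import List

M = 2000                                   # number of grid cells (assumed)
h = 1.0 / M
grid = [(j + 0.5) * h for j in range(M)]   # midpoints of the cells
density = [1.0] * M                        # p_0(x) = 1 on [0, 1]


def mean(values: List[float]) -> float:
    """Midpoint-rule approximation of the integral of x p(x) over [0, 1]."""
    return sum(x * p for x, p in zip(grid, values)) * h


observations = [1, 0, 1, 1, 0, 1]          # assumed binary observations n_k
estimate = mean(density)                   # X_hat_0 = 1/2

for n in observations:
    # Equation (48): multiply p_{k-1}(x) by 1 + (x - X_hat)(n_k - X_hat) / (X_hat (1 - X_hat)).
    scale = (n - estimate) / (estimate * (1.0 - estimate))
    density = [p * (1.0 + (x - estimate) * scale) for x, p in zip(grid, density)]
    estimate = mean(density)

closed_form = (sum(observations) + 1) / (len(observations) + 2)
```

On this grid the recursion keeps the discretized density normalized at every step, and `estimate` agrees with `closed_form` up to the discretization error.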
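Recursive Smoothing Estimate - We wish to find a recursive formula for the 
optimum smoothing estimate $\hat{X}_{k|n}$ of the random variable $X_k$ given $\mathfrak{F}_n$ for $k < n$. 

Theorem 4 (Optimal smoothing estimate). The optimum estimate $\hat{X}_{k|n}$, for 
$k < n$ and $k$ fixed, satisfies the stochastic equation 

$$\hat{X}_{k|n} = \hat{X}_{k|k} + \sum_{i=k}^{n-1} \sum_{m=1}^{\infty} \frac{E_i(X_k \lambda_{i+1}(U_m)) - \hat{X}_{k|i}\, \hat{\lambda}_{i+1|i}(U_m) + h_{k|i}(U_m)}{\hat{\lambda}_{i+1|i}(U_m)\,(1 - \hat{\lambda}_{i+1|i})}\, q_{i+1}(U_m) \tag{49}$$

where 

$$h_{k|i}(U_m) = \sum_{j=1}^{\infty} E_i\big[ X_k \big( \hat{\lambda}_{i+1|i}(U_m)\, \lambda_{i+1}(U_j) - \hat{\lambda}_{i+1|i}(U_j)\, \lambda_{i+1}(U_m) \big) \big] \tag{50}$$

and $q_{i+1}(U_m) = n_{i+1}(U_m) - \hat{\lambda}_{i+1|i}(U_m)$. 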

Proof. The proof is very similar to that of Theorem 2. In fact, the smoothing 
estimate is given by 

$$\hat{X}_{k|n} = E_s(X_k \Lambda_n) = E_s\Big( X_k \Lambda_{n-1} \frac{\Lambda_n}{\Lambda_{n-1}} \Big) \tag{51}$$

where 

$$\frac{\Lambda_n}{\Lambda_{n-1}} = \frac{1 - \lambda_n}{1 - \hat{\lambda}_{n|n-1}}\,(1 - n_n) + \sum_{m=1}^{\infty} \frac{\lambda_n(U_m)}{\hat{\lambda}_{n|n-1}(U_m)}\, n_n(U_m). \tag{52}$$

Upon substitution of (52) in (51), we obtain 

$$\hat{X}_{k|n} = \hat{X}_{k|n-1} + \sum_{m=1}^{\infty} \frac{E_{n-1}(X_k \lambda_n(U_m)) - \hat{X}_{k|n-1}\, \hat{\lambda}_{n|n-1}(U_m) + h_{k|n-1}(U_m)}{\hat{\lambda}_{n|n-1}(U_m)\,(1 - \hat{\lambda}_{n|n-1})}\, q_n(U_m) \tag{53}$$

for $n = k+1, k+2, \ldots$. Writing the stochastic equations for $\hat{X}_{k|n-1}, \hat{X}_{k|n-2}, \ldots$, 
etc., we deduce (49). 

Recursive Prediction Estimate - We wish to find a recursive equation for the optimum 
prediction estimate $\hat{X}_{k|n}$ of the random variable $X_k$ given $\mathfrak{F}_n$ for $n < k$. We 
assume here that $X_k$ is $\mathfrak{B}_s$ measurable. A sufficient condition for the $\mathfrak{B}_s$ 
measurability of $X_k$ is that $X_k$ be constant on $\Omega_m$. 

Theorem 5 (Optimum prediction estimate). Assume that the random variable 
$X_k$ is $\mathfrak{B}_s$ measurable. The optimum prediction estimate $\hat{X}_{k|n}$, for $n < k$ and $k$ 
fixed, satisfies the stochastic equation 

$$\hat{X}_{k|n} = \hat{X}_{k|0} + \sum_{i=0}^{n-1} \sum_{m=1}^{\infty} \frac{E_i(X_k \lambda_{i+1}(U_m)) - \hat{X}_{k|i}\, \hat{\lambda}_{i+1|i}(U_m) + h_{k|i}(U_m)}{\hat{\lambda}_{i+1|i}(U_m)\,(1 - \hat{\lambda}_{i+1|i})}\, q_{i+1}(U_m) \tag{54}$$

where $h_{k|i}(U_m)$ is defined in (50). 

Proof - The proof of this theorem is identical to that of Theorem 4 and will be 
omitted. 

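A special case of Theorem 4 is for $\mathcal{U} = \{U_1\} = \{1\}$. In this case, the 
recursive formula for $\hat{X}_{k|n}$ becomes 

$$\hat{X}_{k|n} = \hat{X}_{k|k} + \sum_{i=k}^{n-1} \frac{E_i(X_k \lambda_{i+1}) - \hat{X}_{k|i}\, \hat{\lambda}_{i+1|i}}{\hat{\lambda}_{i+1|i}\,(1 - \hat{\lambda}_{i+1|i})}\, \big( n_{i+1} - \hat{\lambda}_{i+1|i} \big) \tag{55}$$

which has been derived by Segall [1]. The prediction estimate, for 
$\mathcal{U} = \{1\}$, becomes 

$$\hat{X}_{k|n} = \hat{X}_{k|0} + \sum_{i=0}^{n-1} \frac{E_i(X_k \lambda_{i+1}) - \hat{X}_{k|i}\, \hat{\lambda}_{i+1|i}}{\hat{\lambda}_{i+1|i}\,(1 - \hat{\lambda}_{i+1|i})}\, \big( n_{i+1} - \hat{\lambda}_{i+1|i} \big) \tag{56}$$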

5. COMMENTS ON OPTIMUM LINEAR ESTIMATION 

In this section we indicate how to obtain the best linear estimate $\hat{\lambda}_{k|n}(U_m)$ 
of the intensity function $\lambda_k(U_m)$, $m = 1, 2, \ldots$, in the sense of minimizing the error 
covariance function $E[(\lambda_k(U_m) - \hat{\lambda}_{k|n}(U_m))^2]$, by observing a sample path realization 
$(n_i(U_m),\ i = 1, 2, \ldots, n)$, $n \ge k$, of a discrete time point process $(n_i(U_m))$ which is 
obtained (see Section 2) from an arbitrary discrete time, discrete amplitude jump 
process $(y_i,\ i = 1, 2, \ldots, n)$. As we discuss in Section 2, the Doob submartingale 
decomposition of $n_i(U_m)$ gives 

$$n_i(U_m) = \lambda_i(U_m) + q_i(U_m) \tag{57}$$

for $i = 1, 2, \ldots, n$; $m = 1, 2, \ldots$, where $q_i(U_m)$ is a MD sequence, therefore 

$$E(q_i(U_m)\, q_j(U_m)) = 0 \quad \text{for all } i \ne j;\ i, j = 1, 2, \ldots, n.$$

The best linear estimate $\hat{\lambda}_{k|n}(U_m)$ is of the form 

$$\hat{\lambda}_{k|n}(U_m) = E(\lambda_k(U_m)) + \sum_{i=1}^{n} H_{ki}(U_m)\, \big( n_i(U_m) - E(\lambda_i(U_m)) \big) \tag{58}$$

where the unit response $H_{ki}(U_m)$ is obtained from the orthogonality principle 

$$E\big[ \big( \lambda_k(U_m) - \hat{\lambda}_{k|n}(U_m) \big) \big( n_j(U_m) - E(\lambda_j(U_m)) \big) \big] = 0 \tag{59}$$

for $k, j \le n$, and $m = 1, 2, \ldots$. 

When $\lambda_k(U_m)$ is the state of a linear dynamical system, the Kalman filter 
can be used to recursively compute the optimum linear estimates. 


6. A CLASS OF PROBLEMS IN WHICH THE OPTIMUM ESTIMATES ARE 
LINEAR 

It is well known that the unconstrained minimum mean-square error 
estimates of one set of random variables from another set are linear when the 
two sets are jointly normal. Few other examples are known where the optimum 
estimates are linear. In this section we present a problem for discrete time 
point processes in which the optimum estimates are linear. 

We will examine the problem of estimating the rate parameter $X$ for a 
binary discrete time point process when $X$ is a random variable with the 
probability density function 

$$p_X(x) = \begin{cases} \dfrac{(k+m+1)!}{m!\, k!}\, x^{k} (1-x)^{m} & \text{for } 0 \le x \le 1 \\[2ex] 0 & \text{elsewhere} \end{cases} \tag{60}$$

where $k$ and $m$ are non-negative integers. Let us assume that the observed 
discrete time point process $y_n$, $n = 1, 2, \ldots$ is a sequence of binary numbers with 

$$P(y_n = 1 \mid X) = 1 - P(y_n = 0 \mid X) = X \tag{61}$$

and that it is an independent sequence conditioned on $X$, that is, 

$$P(y_i = \xi_i,\ i = 1, \ldots, n \mid X) = \prod_{i=1}^{n} P(y_i = \xi_i \mid X) = X^{S} (1 - X)^{n - S} \tag{62}$$

where 

$$S = \sum_{i=1}^{n} \xi_i.$$

From (25) it follows that 

$$p_X(x \mid y_i = \xi_i,\ i = 1, \ldots, n) = p_X(x)\, \frac{P(y_i = \xi_i,\ i = 1, \ldots, n \mid X = x)}{\int_0^1 P(y_i = \xi_i,\ i = 1, \ldots, n \mid X = x)\, p_X(x)\, dx} = \frac{x^{k+S} (1 - x)^{m+n-S}}{\int_0^1 x^{k+S} (1 - x)^{m+n-S}\, dx} \quad \text{for } 0 \le x \le 1 \tag{63}$$

Using the fact that 

$$\int_0^1 x^{m} (1 - x)^{n}\, dx = \frac{m!\, n!}{(m+n+1)!}$$

yields 

$$p_X(x \mid y_i,\ i = 1, \ldots, n) = \frac{(k+m+n+1)!}{(k+S)!\,(m+n-S)!}\, x^{k+S} (1 - x)^{m+n-S} \quad \text{for } 0 \le x \le 1 \tag{64}$$

Therefore, the minimum mean-square error estimate of $X$ given $y_1, \ldots, y_n$ is 

$$\hat{X}_n = E\{X \mid y_1, \ldots, y_n\} = \int_0^1 x\, p_X(x \mid y_i,\ i = 1, \ldots, n)\, dx = \frac{k + 1 + \sum_{i=1}^{n} y_i}{n + k + m + 2} \tag{65}$$

which is a linear estimate. This result is not at all obvious from the recursive 
estimation formula of Example 1 in Section 4. Notice that as $n$ becomes large, 
the optimum estimate converges to the proportion of ones in the observed 
sequence. 
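
The step from (64) to (65) can be checked with exact rational arithmetic. The sketch below evaluates the posterior mean as a ratio of two integrals of the form $\int_0^1 x^a (1-x)^b\, dx$ and compares it with the linear expression in (65); the parameter values and function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def beta_integral(a: int, b: int) -> Fraction:
    """Exact integral of x^a (1 - x)^b over [0, 1] for integers a, b >= 0."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def posterior_mean(k: int, m: int, n: int, S: int) -> Fraction:
    """Mean of the posterior density (64): ratio of two beta integrals."""
    a, b = k + S, m + n - S
    return beta_integral(a + 1, b) / beta_integral(a, b)


def linear_formula(k: int, m: int, n: int, S: int) -> Fraction:
    """Right-hand side of (65) with S = y_1 + ... + y_n."""
    return Fraction(k + 1 + S, n + k + m + 2)


# Assumed prior parameters and number of observations.
k, m, n = 2, 3, 10
means = [posterior_mean(k, m, n, S) for S in range(n + 1)]
increments = {means[S + 1] - means[S] for S in range(n)}
```

Every unit increase in $S$ raises the estimate by the same amount, $1/(n + k + m + 2)$, which is what linearity in the observations means here.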

The optimum estimate is unbiased. This follows since 

$$E\{\hat{X}_n\} = \frac{k + 1 + n E\{y_i\}}{n + k + m + 2} \tag{66}$$

and 

$$E\{y_i\} = E\{E\{y_i \mid X\}\} = E\{X\} = \frac{k+1}{k+m+2} \tag{67}$$

so that 

$$E\{\hat{X}_n\} = E\{X\} \tag{68}$$

The linear minimum mean-square error estimate can also be derived by 
appealing to the Doob decomposition and expressing the observations as 
$y_i = X + q_i$. The sequence $q_i = y_i - X$ is a martingale difference sequence with 

$$E\{q_i q_j\} = E\{E\{(y_i - X)^2 \mid X\}\}\, \delta_{ij} = E\{X(1 - X)\}\, \delta_{ij} = \frac{(m+1)(k+1)}{(m+k+2)(m+k+3)}\, \delta_{ij} \tag{69}$$

The observations can be arranged in the matrix form 

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} X + \begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix} \tag{70}$$

or, compactly, 

$$Y = A X + Q \tag{70'}$$


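Then the optimum linear estimate is [7, Ch. 13] 

$$\hat{X}_n = E\{X\} + (A^{T} R^{-1} A + V^{-1})^{-1} A^{T} R^{-1} (Y - E\{Y\}) \tag{71}$$

where $R = \operatorname{cov} Q$ and $V = \operatorname{var} X$. This reduces to the conditional mean $\hat{X}_n$ 
derived above. The corresponding mean-square error is 

$$E\{(X - \hat{X}_n)^2\} = (A^{T} R^{-1} A + V^{-1})^{-1} = \frac{(k+1)(m+1)}{(k+m+2)(k+m+3)(n+k+m+2)} \tag{72}$$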


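The Kalman filter [8] can be used to obtain the optimum linear estimates 
recursively. If we consider $X$ to be the state of the dynamical system 
$X_{n+1} = X_n$ with $X_0 = X$, then the observations are $y_n = X_n + q_n$ and the Kalman 
filter equations become 

$$\hat{X}_n = \hat{X}_{n-1} + \frac{\Gamma_n}{\operatorname{var} q_n}\, (y_n - \hat{X}_{n-1}) \tag{73}$$

and 

$$\Gamma_n = \frac{\Gamma_{n-1}\, (\operatorname{var} q_n)}{\Gamma_{n-1} + \operatorname{var} q_n} \tag{74}$$

with the initial conditions 

$$\hat{X}_0 = E\{X\} \quad \text{and} \quad \Gamma_0 = \operatorname{var} X \tag{75}$$

The mean-square error at time $n$ is $\Gamma_n$. 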
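
The recursion (73)-(75) can be run exactly and compared with the closed forms (65) and (72). The sketch below uses rational arithmetic and takes $\operatorname{var} q_n$ and $\operatorname{var} X$ from the moments of the density (60); the parameter values, the observation sequence, and the function name are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def kalman_estimates(k: int, m: int, ys: Sequence[int]) -> List[Tuple[Fraction, Fraction]]:
    """Run (73)-(75) for the constant state X with prior density (60)."""
    c = k + m + 2
    var_q = Fraction((k + 1) * (m + 1), c * (c + 1))        # E{X(1 - X)}, as in (69)
    gamma = Fraction((k + 1) * (m + 1), c * c * (c + 1))    # Gamma_0 = var X
    x_hat = Fraction(k + 1, c)                              # X_hat_0 = E{X}
    history = []
    for y in ys:
        gamma = gamma * var_q / (gamma + var_q)             # equation (74)
        x_hat = x_hat + (gamma / var_q) * (y - x_hat)       # equation (73)
        history.append((x_hat, gamma))
    return history


# Assumed prior parameters and binary observations.
k, m = 1, 2
ys = [1, 0, 0, 1, 1, 1, 0]
results = kalman_estimates(k, m, ys)
```

For this class of priors each pair in `results` reproduces the conditional mean (65) and the mean-square error (72) exactly, so the linear recursion loses nothing relative to the optimum estimate.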

In computer simulations of the optimum nonlinear recursive estimator given 
by (44) and (48) and the optimum linear recursive estimator of the previous paragraph, 
with probability density functions for $X$ not belonging to the class in this section, 
we found only small differences in the estimates. This can be explained by the 
fact that after a few steps the a posteriori probability density function becomes 
peaked about the true value of $X$ and can be closely approximated by a density 
of the class in this section. Then the two estimates become nearly identical. 
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